K-Nearest Neighbor Estimation of Forest Attributes: Improving Mapping Efficiency
Abstract
This paper describes our efforts to refine k-nearest neighbor classification of forest attributes using U.S. Department of Agriculture Forest Service Forest Inventory and Analysis (FIA) plot data and Landsat 7 Enhanced Thematic Mapper Plus imagery. The analysis focuses on FIA-defined forest type classification across St. Louis County in northeastern Minnesota. We outline three steps in the classification process that improve mapping efficiency: (1) using transformed divergence for spectral feature selection, (2) applying a mathematical rule to reduce the nearest neighbor search set, and (3) using a database to avoid redundant nearest neighbor searches. Our trials suggest that, when combined, these approaches can reduce mapping time by half without significant loss of accuracy.

The k-nearest neighbor (kNN) multisource inventory has proved timely, cost-efficient, and accurate in the Nordic countries and in initial U.S. trials (Franco-Lopez et al. 2001, Haapanen et al. 2004, McRoberts et al. 2002). This approach for extending field point inventories is well suited to the estimation and monitoring needs of Federal agencies, such as the U.S. Department of Agriculture (USDA) Forest Service, that conduct natural and agricultural resource inventories. It provides wall-to-wall maps of forest attributes, retains the natural data variation found in the field inventory (unlike many parametric algorithms), and provides precise, localized estimates in common metrics across large areas and various ownerships.

In pixel-level classification, the kNN algorithm assigns each unknown (target) pixel the field attributes of the most similar reference pixels for which field data exist. Similarity is defined in terms of the feature space, typically measured as the Euclidean or Mahalanobis distance between spectral features.
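The pixel-level assignment described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes Euclidean distance, a majority vote among the k nearest references, and made-up three-band feature vectors and forest type labels. The names `euclidean`, `knn_classify`, and `references` are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two spectral feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(target, references, k=3):
    """Assign a target pixel the most common label among its k nearest references.

    references: list of (feature_vector, forest_type) pairs (hypothetical layout).
    """
    nearest = sorted(references, key=lambda ref: euclidean(target, ref[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Illustrative reference pixels: (spectral features, forest type). Values are invented.
references = [
    ((30, 80, 40), "aspen-birch"),
    ((32, 78, 42), "aspen-birch"),
    ((20, 50, 25), "spruce-fir"),
    ((22, 52, 24), "spruce-fir"),
    ((21, 49, 27), "spruce-fir"),
]

label = knn_classify((31, 79, 41), references, k=3)
print(label)
```

Each call compares the target against every reference, which is why the cost of mapping a whole image grows with the product of the pixel count and the reference count.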
The kNN algorithm is not mathematically complex; however, using multiple image dates and several features from each date, along with several thousand field reference observations, makes kNN pixel-based mapping of large areas very inefficient. Specifically, kNN classification requires approximately F·N distance calculations, where F is the number of pixels to classify and N is the number of reference observations. For example, standard kNN mapping of a 1.3 × 10^6 ha area with a pixel size of 30 m × 30 m and approximately 1,500 FIA field reference observations requires about 22 billion distance calculations and around 16 hours to process on a Pentium 4, single-processor computer.

Our study examined the use of USDA Forest Service Forest Inventory and Analysis (FIA) plot data and Landsat 7 Enhanced Thematic Mapper Plus (ETM+) imagery in kNN classification of FIA-defined forest types. Specific emphasis is placed on improving mapping efficiency by reducing the classification feature space, decreasing the number of distance calculations in the nearest neighbor search, and eliminating redundant nearest neighbor searches by building a database of feature patterns associated with different forest type classes.
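As a rough check on the F·N estimate, and to illustrate the idea of avoiding redundant searches, the sketch below computes the distance count for the example area and then classifies pixels with a lookup table keyed on the feature pattern. It is only an assumption about how such a database might behave: the paper's actual database design is not described here, an in-memory dictionary stands in for it, and all function names and data values are hypothetical.

```python
import math
from collections import Counter


def distance_count(area_ha, pixel_size_m, n_refs):
    """Approximate F*N: pixels in the area times reference observations."""
    pixels = area_ha * 10_000 / (pixel_size_m ** 2)  # 1 ha = 10,000 m^2
    return pixels * n_refs


def knn_classify(target, references, k=1):
    """Brute-force kNN with Euclidean distance and majority vote."""
    nearest = sorted(references, key=lambda ref: math.dist(target, ref[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def classify_image(pixels, references, k=1):
    """Classify pixels, searching only once per distinct feature pattern.

    Returns the labels and the number of nearest neighbor searches performed.
    """
    cache = {}      # feature pattern -> forest type (stand-in for a database)
    searches = 0
    labels = []
    for pixel in pixels:
        key = tuple(pixel)
        if key not in cache:
            cache[key] = knn_classify(key, references, k)
            searches += 1
        labels.append(cache[key])
    return labels, searches


# About 2.2e10 distance calculations for the example in the text.
total = distance_count(1.3e6, 30, 1500)
print(f"{total:.3g}")

# Invented data: four pixels, but only two distinct feature patterns.
references = [((30, 80, 40), "aspen-birch"), ((20, 50, 25), "spruce-fir")]
pixels = [(30, 80, 40), (20, 50, 25), (30, 80, 40), (30, 80, 40)]
labels, searches = classify_image(pixels, references)
print(labels, searches)
```

The saving depends on how often identical feature patterns recur in the image, so the benefit is larger when the feature space has already been reduced to a few quantized bands.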